Speaker verification in score-ageing-quality classification space
Abstract
A challenge in automatic speaker verification is to create a system that is robust to the effects of vocal ageing. To observe the ageing effect, a speaker’s voice must be analysed over a period of time, during which variation in the quality of the voice samples is likely to be encountered. Thus, in dealing with the ageing problem, the related issue of quality must also be addressed. We present a solution to speaker verification across ageing by using a stacked classifier framework to combine ageing and quality information with the scores of a baseline classifier. In tandem, the Trinity College Dublin Speaker Ageing database of 18 speakers, each covering a 30-60 year time range, is presented. An evaluation of a baseline Gaussian Mixture Model-Universal Background Model (GMM-UBM) system using this database demonstrates a progressive degradation in genuine speaker verification scores as ageing progresses. Consequently, applying a conventional threshold, determined using scores at the time of enrolment, results in poor long-term performance. The influence of quality on verification scores is investigated via a number of quality measures. Alongside established signal-based measures, a new model-based measure, Wnorm, is proposed, and its utility is demonstrated on the CSLU database. Combining ageing information with quality measures and the scores from the GMM-UBM system, a verification decision boundary is created in score-ageing-quality space. The best performance is achieved by using scores and ageing in conjunction with the new Wnorm quality measure, reducing verification error by 45% relative to the baseline. This work represents the first comprehensive analysis of speaker verification on a longitudinal speaker database and successfully addresses the associated variability from ageing and quality artefacts.
Corresponding author. Tel: +353 1 896 1580. Preprint submitted to Computer Speech and Language, February 19, 2013.
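As a rough illustration of a decision boundary in score-ageing-quality space, the sketch below trains a small logistic-regression stacker on synthetic trials and compares it with a fixed score threshold set from enrolment-time scores. Everything in it is an assumption made for illustration: the abstract does not say which stacked classifier is used, so logistic regression is a stand-in; the synthetic score model (genuine scores drifting down with years since enrolment and up with quality) and the names `make_trials`, `train_logistic` and `error_rate` are hypothetical; and the quality value is a generic number in [0, 1], not the Wnorm measure.

```python
import math
import random


def make_trials(n, rng):
    """Synthetic trials: features (score, ageing, quality) and a genuine/impostor label."""
    trials = []
    for _ in range(n):
        years = rng.uniform(0.0, 30.0)    # assumed years since enrolment
        quality = rng.uniform(0.0, 1.0)   # assumed generic quality value, not Wnorm
        genuine = rng.random() < 0.5
        if genuine:
            # Assumed model: genuine scores fall with ageing and rise with quality.
            score = 2.0 - 0.08 * years + (quality - 0.5) + rng.gauss(0.0, 0.3)
        else:
            score = -1.0 + rng.gauss(0.0, 0.3)
        # Ageing is rescaled to [0, 1] so all features have comparable ranges.
        trials.append(((score, years / 30.0, quality), 1 if genuine else 0))
    return trials


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(trials, epochs=500, lr=0.5):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b, n = [0.0, 0.0, 0.0], 0.0, len(trials)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0, 0.0], 0.0
        for x, y in trials:
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for i in range(3):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def error_rate(trials, decide):
    """Fraction of trials where the accept (1) / reject (0) decision is wrong."""
    return sum(1 for x, y in trials if decide(x) != y) / len(trials)


rng = random.Random(0)
train, test = make_trials(400, rng), make_trials(400, rng)
w, b = train_logistic(train)

# Baseline: fixed threshold midway between the assumed enrolment-time
# genuine mean (2.0) and impostor mean (-1.0), applied to the score alone.
FIXED_THRESHOLD = 0.5
baseline_error = error_rate(test, lambda x: 1 if x[0] > FIXED_THRESHOLD else 0)

# Stacked: linear boundary over (score, ageing, quality).
stacked_error = error_rate(
    test, lambda x: 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
)

print(f"fixed-threshold error: {baseline_error:.3f}")
print(f"stacked-boundary error: {stacked_error:.3f}")
```

On this synthetic data the fixed threshold rejects many genuine trials recorded long after enrolment, whereas the learned boundary can shift with ageing and quality. The numbers it prints say nothing about the results reported in the abstract.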
Similar resources
Compensating for Ageing and Quality variation in Speaker Verification
Performing speaker verification in the simultaneous presence of ageing progression and changing speech sample quality is an important, open problem. The issues of ageing and quality variation go hand in hand; the effect of ageing increases with time, while variations in quality are also more likely to be encountered as time passes. In this work we demonstrate the effect of ageing on speaker ver...
Effects of Long-Term Ageing on Speaker Verification
The changes that occur in the human voice due to ageing have been well documented. The impact of these changes on speaker verification is less clear. In this work, we examine the effect of long-term vocal ageing on a speaker verification system. On a cohort of 13 adult speakers, using a conventional GMM-UBM system, we carry out longitudinal testing of each speaker across a time span of 30-40 ye...
Eigenageing compensation for speaker verification
Dealing with the effect of vocal ageing on speaker verification is an important challenge. In this paper, a new approach to improving speaker verification performance in the presence of long-term ageing is presented. Analogous to eigenchannel compensation, the proposed eigenageing compensation method operates by adapting a speaker model to a test sample based on a predetermined ageing subspace....
Quality-Based Score Normalization for Audiovisual Person Authentication
This paper addresses the problem of biometric audiovisual person authentication in realistic acquisition conditions. Differences in environmental factors or acquisition devices between enrollment and test conditions modify the verification scores distribution and degrade verification performance if not taken into account. A theoretical framework that incorporates quality measures to biometric a...
Regression Optimized Kernel for High-level Speaker Verification
Computing the likelihood-ratio (LR) score of a test utterance is an important step in speaker verification. It has recently been shown that for discrete speaker models, the LR scores can be expressed as dot products between supervectors formed by the test utterance, target-speaker model, and background model. This paper leverages this dot-product formulation and the representer theorem to deriv...
Journal:
- Computer Speech & Language
Volume 27
Publication date: 2013